FIGURE 3.12
In PCNNs, a new discrete backpropagation via projection is proposed to build binarized neural networks in an end-to-end manner. Full-precision convolutional kernels $C_i^l$ are quantized by projection as $\hat{C}_{i,j}^l$. The use of multiple projections enriches the diversity of the binarized kernels. The resulting kernel tensor $D_i^l$ is used in the same way as a conventional kernel. Both the projection loss $L_p$ and the traditional loss $L_s$ are used to train PCNNs. We illustrate our network structure, a Basic Block Unit based on ResNet; more specific details are shown in the dotted box (projection convolution layer). © indicates the concatenation operation on the channels. Note that inference uses neither the projection matrices $W_j^l$ nor the full-precision kernels $C_i^l$.
Thanks to the flexible projection scheme, we obtain diverse binarized models with higher performance than the previous ones.
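To make the layer in Figure 3.12 concrete, the following minimal NumPy sketch binarizes one kernel with several projection matrices and concatenates the resulting feature maps on the channel axis. The function names, the sign-based choice of $\Omega = \{-1, +1\}$, and the `conv` stand-in are our assumptions for illustration, not the exact PCNN implementation.

```python
import numpy as np

def sign_projection(W, C):
    # Project the scaled kernel W ∘ C onto Ω = {-1, +1}
    # (a special case of Eq. 3.29 with U = 2; names are illustrative).
    return np.where(W * C >= 0, 1.0, -1.0)

def projection_conv_forward(x, C, Ws, conv):
    # Sketch of a projection convolution layer: each hypothetical
    # projection matrix W_j yields a binarized kernel, and the
    # feature maps are concatenated on the channel axis (the ©
    # operation in Figure 3.12). `conv` stands in for an ordinary
    # convolution routine; a channels-first layout is assumed.
    outputs = [conv(x, sign_projection(W, C)) for W in Ws]
    return np.concatenate(outputs, axis=1)
```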
3.5.1 Projection
In our work, we define the quantization of the input variable as a projection onto a set

$$\Omega := \{a_1, a_2, \ldots, a_U\}, \tag{3.28}$$

where each element $a_i$, $i = 1, 2, \ldots, U$, satisfies the constraint $a_1 < a_2 < \cdots < a_U$ and is a discrete value of the input variable. We then define the projection of $x \in \mathbb{R}$ onto $\Omega$ as

$$P_\Omega(\omega, x) = \arg\min_{a_i} \|\omega \circ x - a_i\|, \quad i \in \{1, \ldots, U\}, \tag{3.29}$$

where $\omega$ is a projection matrix and $\circ$ denotes the Hadamard product. Equation 3.29 indicates that the projection finds the discrete value closest to each entry of the scaled input $\omega \circ x$.
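The following NumPy sketch is a direct transcription of Eq. 3.29; the function name and the identity choice of $\omega$ are ours for illustration.

```python
import numpy as np

def project_onto_omega(omega, x, Omega):
    # P_Omega(omega, x) from Eq. 3.29:
    #   omega : projection matrix (element-wise, i.e., Hadamard, scaling)
    #   x     : continuous input array
    #   Omega : 1-D array of discrete values a_1 < ... < a_U
    scaled = omega * x  # ω ∘ x
    # Distance from each scaled entry to every a_i; take the arg min.
    idx = np.abs(scaled[..., None] - Omega).argmin(axis=-1)
    return Omega[idx]

# Usage with the binary set Ω = {-1, +1}:
Omega = np.array([-1.0, 1.0])
x = np.array([[0.3, -1.2], [2.0, -0.1]])
omega = np.ones_like(x)  # identity projection, purely for illustration
print(project_onto_omega(omega, x, Omega))  # [[ 1. -1.] [ 1. -1.]]
```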
3.5.2 Optimization
Minimizing $f(x)$ by discrete optimization or integer programming methods, whose variables are restricted to discrete values, becomes more challenging when training a